ABSTRACT

A novel sensor control solution is presented, formulated
within a Multi-Bernoulli-based multi-target tracking framework.
The proposed method is designed for the general multi-target
tracking case, where no prior knowledge of the clutter
distribution or the probability of detection profile is
available. Following an information-theoretic approach, our
method uses the Rényi divergence as the reward function to be
maximized when selecting the optimal sensor control command at
each step. We devise a Monte Carlo sampling method for
computing the reward. Simulation results demonstrate
successful performance of the proposed method in a
challenging scenario involving five targets maneuvering in a
relatively uncertain space with unknown distance-dependent
clutter rate and probability of detection.

Index Terms — Random finite sets, multi-target filtering, 
sequential Monte Carlo, sensor control, Rényi divergence.

1. INTRODUCTION 

Sensor control, in general, comprises a multi-object filtering
process and an optimal decision-making method. The sensor
control problem considered in this paper is focused on
controlling the states of mobile sensors that are used for
multi-target tracking. In any sensor control framework, the
control commands are the direct outputs of the above-mentioned
decision-making component, which is the process of selecting
the optimal control command. Thus, most of the existing
solutions are focused on devising and improving that
component, assuming that the multi-object filtering framework
can effectively utilize the sensor measurements and return
accurate estimates of the number and states of the targets.
However, the multi-target tracking framework employed with
sensor control itself plays a significant role in the overall
performance of the scheme in terms of real-time accuracy and
robustness.

A few sensor control solutions have recently been developed
for multi-target tracking scenarios where tracking is
performed by multi-object filters based on Finite Set
Statistics (FISST) theory [1-6]. As in classical approaches,
the general form of the Bayesian recursion involved in
FISST-based methods is not computationally tractable [1].
Consequently, several approximations have been suggested. The
PHD filter and its extended version, the CPHD filter [1],
Multi-Bernoulli filters [1,7] and the Vo-Vo prior filter [8]
are the most well-known instances of such approximations.

The most common approach to the objective function in
sensor control is information-driven: sensor control aims to
improve the information content of the multi-object
distribution by optimizing some measure of information gain.
The most common choice of objective function in
information-driven methods is the Rényi divergence.
Ristic et al. [2] used the Rényi divergence as the objective
function in conjunction with a random set filter and a
PHD-based filter for scenarios where the clutter rate and the
uncertainty in the sensor Field of View (FoV) are known. This
paper presents a novel sensor control solution designed to
work within a robust Multi-Bernoulli-based multi-target
tracking framework. Our sensor control solution does not need
any prior knowledge of the clutter distribution or the
probability of detection profile.

2. ROBUST MULTI-BERNOULLI FILTER 

Vo et al. [9] have recently tackled the problem of
multi-Bernoulli filtering for cases where the clutter
intensity and detection probability profile are unknown. In
this solution, the detection probability is augmented to the
multi-target state and propagated in time. A set of clutter
generators is used to create hypothetical targets associated
with clutter measurements. The transition and observation
models for the clutter-associated targets are taken to be
similar to those of actual targets. These two types of
targets form a hybrid space
$\mathcal{X} = \mathcal{X}^{(0)} \cup \mathcal{X}^{(1)}$, where
the superscripts (0) and (1) denote the spaces of clutter and
actual targets, respectively. The augmented multi-target
state includes a state $a$ (the probability of detection)
and a multi-Bernoulli set state $X$.

The multi-Bernoulli random set of targets is the union of
an ensemble of $M$ Bernoulli sets, each having a probability
of existence, $r^{(i,u)}$, and two single-object densities
denoted by $p^{(i,u)}(a, x)$, where $u = 1$ corresponds to
objects that are actual targets and $u = 0$ corresponds to
clutter.

In this method, the prior multi-target distribution at time
$k - 1$ is used in the prediction step, which incorporates
models of the dynamics and births of targets and clutter
generators. The multi-target distribution is then updated
using current sensor measurements (see [9] for details of the
prediction and update steps and formulas). In the following
section, we explain how the above method can be modified to
incorporate a sensor selection step.

3. SENSOR CONTROL FRAMEWORK 

We formulate the sensor control problem in the partially
observable Markov decision process (POMDP) framework in
conjunction with a robust multi-Bernoulli filter [9] to
concurrently solve the sensor control and multi-object
estimation problems with unknown clutter intensity and
sensor FoV.

A POMDP is a generalized form of a Markov decision process
(MDP) that is suitable for sensor planning expressed as a
series of control commands (actions). In the POMDP framework,
there is no direct access to the states, and decisions are
made using only uncertain observations. A POMDP at any time
step $k$ can be defined as a tuple

$$\Psi = \{X_k, \mathbb{S}, f_{k|k-1}(x_k|x_{k-1}), Z_k, g_k(z|x), \vartheta(X_{k-1}, s, X_k)\},$$

where $X_k$ is a finite set of single-object states;
$\mathbb{S}$ defines a set of sensor control commands;
$f_{k|k-1}(x_k|x_{k-1})$ is a transition model for the
single-object state; $Z_k$ comprises a finite set of
observations; $g_k(z|x)$ is a stochastic measurement model;
and $\vartheta(X_{k-1}, s, X_k)$ is an objective function that
returns a reward or cost for the transition from the
multi-object state $X_{k-1}$ to the state $X_k$ when an action
command $s \in \mathbb{S}$ is applied.

The purpose of sensor control is to find the control
command $s$ that optimizes the objective function. In an
information-theoretic approach, the reward function depends
on the distributions of $X_{k-1}$ and $X_k$, as well as on the
future measurements, which are distributed according to the
choice of the control command. Thus, the control command is
commonly chosen to maximize the statistical mean of the
reward function over all future measurements,

$$s_k = \arg\max_{s \in \mathbb{S}} \left\{ \mathbb{E}_{Z_k}\left[ \vartheta(X_{k-1}, s, X_k) \right] \right\}. \qquad (1)$$
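As a minimal sketch of the selection rule in (1), the following Python fragment replaces the expectation over future measurements with a sample mean. The function names, the callable interfaces, and the toy reward are assumptions made purely for illustration; they are not taken from the paper or from [9].

```python
import random


def select_command(commands, sample_measurements, reward, num_samples=100, seed=0):
    """Return (best command, its sample-mean reward).

    commands: candidate control commands (hypothetical encoding).
    sample_measurements(s, rng): draws one future measurement set for command s.
    reward(s, Z): reward obtained for command s and measurement set Z.
    """
    rng = random.Random(seed)
    best_cmd, best_val = None, float("-inf")
    for s in commands:
        # Sample-mean approximation of the expected reward for this command.
        total = 0.0
        for _ in range(num_samples):
            total += reward(s, sample_measurements(s, rng))
        mean = total / num_samples
        if mean > best_val:
            best_cmd, best_val = s, mean
    return best_cmd, best_val


# Toy usage: scalar commands, a reward that peaks at s = 2, and small
# zero-mean measurement noise (all hypothetical).
def toy_measurements(s, rng):
    return rng.gauss(0.0, 0.1)


def toy_reward(s, z):
    return -(s - 2.0) ** 2 + z


best, value = select_command([0.0, 1.0, 2.0, 3.0], toy_measurements, toy_reward)
```

Averaging over sampled measurements is the costly part of this rule, since the reward must be evaluated `num_samples` times per candidate command.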

4. ROBUST MULTI-BERNOULLI SENSOR CONTROL

Suppose that at time $k - 1$, the posterior multi-object
density is approximated by a multi-Bernoulli RFS. Using this
distribution, the predicted multi-Bernoulli state is
computed. For each sensor command, the updated multi-object
distribution at time $k$ is computed, where each density
$p_k^{(i,u)}(\cdot)$ is approximated by a set of weighted
support points. Following [2], we use the Rényi divergence as
the objective function. We note that the value of the
objective function depends on future measurements, which
vary stochastically even for a given control command.
Following [10], we skip the substantial computations needed
for averaging over all future measurements and use the
predicted ideal measurement set (PIMS) approach to generate
the measurements required to update the predicted
distribution in the decision-making step.

The PIMS is produced as follows. The cardinality and
state of the multi-object RFS are estimated from the predicted
multi-Bernoulli distribution. These include actual target
estimates and point estimates corresponding to generated
clutter. Using the estimated number and states of the objects,
a set of noise-free measurements is created and denoted by
$Z$. This set includes clutter measurements associated with
clutter generators, which are not discriminated from real
objects in the prediction step.¹ Thus, it is necessary to
perform an initial update step using $Z$ as the measurement
set. The number of target objects and their states are then
estimated from this initially updated distribution. Using the
estimated target states, for each control command $s$, a
single noise-free measurement is generated for each object,
and the set of such measurements is taken as the PIMS, denoted
by $Z(s)$. For each control command $s$, the corresponding
PIMS $Z(s)$ is then used to update the multi-Bernoulli
multi-object density, from which the objective function is
calculated, and the optimal control command is selected
accordingly.
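The last stage of the PIMS construction, one noise-free measurement per estimated target for each candidate command, can be sketched as follows. This is only an illustration: the function name, the representation of commands as hashable tuples, and the toy measurement function are assumptions, and the initial update step that separates targets from clutter generators is taken as already done.

```python
def predicted_ideal_measurement_sets(target_estimates, commands, measure):
    """Map each command s to its PIMS Z(s).

    target_estimates: estimated target states after the initial update step.
    commands: candidate control commands (hashable, e.g. sensor positions).
    measure(x, s): noise-free measurement of state x under command s
                   (hypothetical measurement function).
    """
    return {s: [measure(x, s) for x in target_estimates] for s in commands}


# Toy usage: the "measurement" is simply the target offset from the sensor.
def offset(x, s):
    return (x[0] - s[0], x[1] - s[1])


targets = [(3.0, 4.0), (6.0, 8.0)]
candidate_positions = [(0.0, 0.0), (3.0, 0.0)]
pims = predicted_ideal_measurement_sets(targets, candidate_positions, offset)
```

Each `pims[s]` would then be passed to the filter update for command `s` before the objective function is evaluated.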

In [2], the Rényi divergence was employed as a
generalization of the Kullback-Leibler divergence. The Rényi
divergence measures the information gain between two
densities:

$$I_\alpha(f_1, f_2) = \frac{1}{\alpha - 1} \log \int [f_1(x)]^{\alpha} [f_2(x)]^{1-\alpha} \, dx. \qquad (2)$$
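For intuition, the divergence in (2) can be evaluated numerically when the two densities are replaced by discrete probability vectors, so that the integral becomes a sum. The sketch below makes that simplifying assumption; the function names are illustrative, and the second vector is assumed to be strictly positive wherever the first one is.

```python
import math


def renyi_divergence(f1, f2, alpha):
    """Rényi divergence of order alpha between two discrete distributions."""
    if alpha <= 0 or alpha == 1:
        raise ValueError("alpha must be positive and different from 1")
    total = sum(p ** alpha * q ** (1 - alpha) for p, q in zip(f1, f2) if p > 0)
    return math.log(total) / (alpha - 1)


def kl_divergence(f1, f2):
    """Kullback-Leibler divergence, the limit of the Rényi divergence as alpha -> 1."""
    return sum(p * math.log(p / q) for p, q in zip(f1, f2) if p > 0)


f1 = [0.5, 0.3, 0.2]
f2 = [0.2, 0.3, 0.5]
d_half = renyi_divergence(f1, f2, 0.5)
d_near_one = renyi_divergence(f1, f2, 0.999999)
d_kl = kl_divergence(f1, f2)
```

With `alpha` close to 1 the value approaches the Kullback-Leibler divergence, which is the sense in which (2) generalizes it.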

In a target tracking context, the first distribution is
usually the updated distribution,
$\pi_{k+1}(X_{k+1}|Z_{1:k}, s_{0:k-1}, Z_{k+1}, s_k)$, and the
second distribution is the predicted distribution,
$\pi_{k+1|k}(X_{k+1}|Z_{1:k}, s_{0:k-1})$. The parameter
$\alpha$ tunes the emphasis on each of the two distributions.
Similar to [11], our sensor control method is based on
computing the reward function for each control command, in
which the updated multi-object distribution is the one
updated using the PIMS.
However, in the particle approximation of the reward function
derived in [11],

$$\mathcal{D}(s_k) \approx \frac{1}{\alpha - 1} \log \frac{\sum_{i} w_k^i \left[ g_{k+1}(Z_{k+1}|X^i_{k+1|k}, s_k) \right]^{\alpha}}{\left[ \sum_{i} w_k^i \, g_{k+1}(Z_{k+1}|X^i_{k+1|k}, s_k) \right]^{\alpha}},$$

computation of the likelihood function
$g_{k+1}(Z_{k+1}|X^i_{k+1|k}, s_k)$ needs knowledge of the
clutter intensity and probability of detection profiles (see
equation (20) in [11]). In the absence of such information, we
propose to compute the reward function directly by Monte
Carlo sampling of the updated multi-Bernoulli multi-object
distribution.

¹ Referring to [9], the predicted existence probabilities are
not computed separately for $u = 0, 1$ (see equation (11) in
[9]), whereas the updated $r^{(i,u)}$ and $p^{(i,u)}(\cdot)$ are
calculated separately for $u = 0, 1$ (see equations (14)-(19)
in [9]).

Algorithm 1 Monte Carlo sampling of a multi-Bernoulli
distribution with given parameters and particles.

Inputs: probabilities of existence $r = [r_1 \cdots r_M]^{\top}$,
particles matrix $P_{M \times L_{\max}} = [x_{ij}]$, and number of
Monte Carlo samples $L$.

Outputs: $L$ sets, each being a Monte Carlo sample of the
multi-Bernoulli distribution, in the form
$X_\ell = \{x_{\ell,1}, \ldots, x_{\ell,n_\ell}\}$, where
$n_\ell \le M$ is the cardinality of the $\ell$-th set.

From the size of the particles matrix, find $M$ and $L_{\max}$.
for $\ell = 1, \ldots, L$ do
    $X_\ell \leftarrow \emptyset$
    for $i = 1, \ldots, M$ do
        Generate $u \sim U(0, 1)$.
        if $u < r_i$ then
            Generate $u \sim U(0, 1)$.
            $j \leftarrow \lceil L_{\max} \, u \rceil$    ▷ A random index between 1 and $L_{\max}$.
            $X_\ell \leftarrow X_\ell \cup \{x_{ij}\}$
        end if
    end for
end for

Monte Carlo sampling of a general multi-Bernoulli
distribution can be performed as follows. Assume that a
multi-Bernoulli distribution is given with parameters
$\{r_i, p_i(\cdot)\}_{i=1}^{M}$ and particle approximation
$p_i(x) \approx \sum_{j} w_{ij} \, \delta(x - x_{ij})$.
Without loss of generality, we assume that after resampling,
$L_{\max}$ particles with equal weights are created for
each Bernoulli component. Algorithm 1 shows our proposed
pseudocode for generating $L$ Monte Carlo samples of the
multi-Bernoulli distribution described above.
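A minimal Python rendering of Algorithm 1 is sketched below, assuming equally weighted particles stored as one list per Bernoulli component. The function name and the seeding argument are illustrative additions, and the zero-based `randrange` call stands in for the ceiling-based index of the pseudocode.

```python
import random


def sample_multi_bernoulli(r, particles, num_samples, seed=None):
    """Draw Monte Carlo samples (finite sets) from a multi-Bernoulli distribution.

    r: existence probabilities, one per Bernoulli component.
    particles: particles[i] holds the equally weighted particles of component i.
    num_samples: number of sets to draw (L in Algorithm 1).
    """
    rng = random.Random(seed)
    num_components = len(r)          # M
    l_max = len(particles[0])        # L_max
    samples = []
    for _ in range(num_samples):
        current_set = []
        for i in range(num_components):
            # Component i contributes a point with probability r[i].
            if rng.random() < r[i]:
                j = rng.randrange(l_max)  # uniform particle index (0-based)
                current_set.append(particles[i][j])
        samples.append(current_set)
    return samples


# Toy usage: the first component always exists, the second never does.
toy_samples = sample_multi_bernoulli(
    r=[1.0, 0.0],
    particles=[[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]],
    num_samples=50,
    seed=1,
)
```

Lists are used instead of Python sets so that repeated particle values, which can occur after resampling, are not silently merged.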

Let us denote the ensemble of Monte Carlo samples of the
updated distribution by $\{X_\ell\}_{\ell=1}^{L}$, where each
sample is itself a set of particles
$X_\ell = \{x_{\ell,1}, \ldots, x_{\ell,n_\ell}\}$, in which
$n_\ell \le M$ is the cardinality of the $\ell$-th sample. The
particle approximation of the updated distribution is given
by the following linear combination of multi-object Dirac
delta functions:

$$\pi_{k+1}(X|Z(s_k)) \approx \sum_{\ell=1}^{L} \frac{1}{L} \, \delta(X - X_\ell). \qquad (3)$$

For a sufficiently large number of Monte Carlo samples, we
have:

$$\forall h(\cdot), \quad \int h(X) \, \pi_{k+1}(X|Z(s_k)) \, \delta X = \sum_{\ell=1}^{L} \frac{1}{L} \, h(X_\ell) + O(L^{-1}). \qquad (4)$$


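Substituting the arbitrary function $h(X)$ in (4) with

$$h(X) = \left[ \frac{\pi_{k+1|k}(X)}{\pi_{k+1}(X|Z(s_k))} \right]^{1-\alpha}$$

and applying $\frac{1}{\alpha-1}\log(\cdot)$ to both sides
leads to:

$$\frac{1}{\alpha-1} \log \int [\pi_{k+1}(X|Z(s_k))]^{\alpha} \, [\pi_{k+1|k}(X)]^{1-\alpha} \, \delta X = \frac{1}{\alpha-1} \log \left\{ \sum_{\ell=1}^{L} \frac{1}{L} \left[ \frac{\pi_{k+1|k}(X_\ell)}{\pi_{k+1}(X_\ell|Z(s_k))} \right]^{1-\alpha} + O(L^{-1}) \right\}. \qquad (5)$$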

We note that the left side of the above equation is the same
as the reward function, $\mathcal{D}(s_k)$, and thus it can be
computed by discarding the $O(L^{-1})$ term. The
multi-Bernoulli distribution terms can be calculated
directly. For general multi-Bernoulli parameters
$\{r^{(i)}, p^{(i)}(\cdot)\}_{i=1}^{M}$, the density at
$X = X_\ell$ is given by [1, p. 369]:


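$$\pi(\emptyset) = \prod_{i=1}^{M} \left( 1 - r^{(i)} \right); \quad \text{if } |X_\ell| > M,\ \pi(X_\ell) = 0; \text{ otherwise,}$$

$$\pi(X_\ell) = \pi(\emptyset) \sum_{1 \le i_1 \ne \cdots \ne i_{n_\ell} \le M} \; \prod_{j=1}^{n_\ell} \frac{r^{(i_j)} \, p^{(i_j)}(x_{\ell,j})}{1 - r^{(i_j)}}. \qquad (6)$$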
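Equation (6) can be evaluated by brute force for small numbers of components, as in the following sketch. It enumerates every ordered choice of distinct components, so its cost grows quickly with $M$; the function name and the use of plain callables for the single-object densities are assumptions for illustration, and every existence probability is assumed to be strictly less than 1.

```python
import itertools
import math


def multi_bernoulli_density(points, r, densities):
    """Evaluate the multi-Bernoulli density at the finite set `points`.

    points: list of elements of the set X.
    r: existence probabilities r^(i), each assumed < 1.
    densities: densities[i](x) evaluates the single-object density p^(i) at x.
    """
    num_components, n = len(r), len(points)
    if n > num_components:
        return 0.0
    pi_empty = math.prod(1.0 - ri for ri in r)
    total = 0.0
    # Sum over all ordered selections of n distinct component indices.
    for idx in itertools.permutations(range(num_components), n):
        total += math.prod(
            r[i] * densities[i](x) / (1.0 - r[i]) for i, x in zip(idx, points)
        )
    return pi_empty * total


# Toy usage: two components with r = 0.5 and unit densities on [0, 1].
def unit_density(x):
    return 1.0


toy_r = [0.5, 0.5]
toy_p = [unit_density, unit_density]
d0 = multi_bernoulli_density([], toy_r, toy_p)
d1 = multi_bernoulli_density([0.3], toy_r, toy_p)
d2 = multi_bernoulli_density([0.3, 0.7], toy_r, toy_p)
```

In this toy case the set integral $d_0 + d_1 + d_2/2!$ equals 1, which is a quick sanity check on the normalization.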


In order to compute the density terms $p^{(i_j)}(\cdot)$ in
(6), we note that for the updated multi-Bernoulli density, the
Monte Carlo samples are drawn using the particles of the
updated single Bernoulli components (see Algorithm 1). More
precisely, when computing the updated multi-Bernoulli density
$\pi_{k+1}(X_\ell)$, the argument of each of the
$p^{(i_j)}(\cdot)$ terms in (6), $x_{\ell,j}$, necessarily
coincides with one of the particles representing the updated
$i_j$-th single-Bernoulli density $p_{k+1}^{(i_j)}(\cdot)$.
This particle is found and its weight is used for the density
term in (6). When computing the predicted density
$\pi_{k+1|k}(X_\ell)$, the density term
$p^{(i_j)}(x_{\ell,j})$ is computed by kernel density
estimation using Gaussian kernels centered at the
predicted particles for the $i_j$-th Bernoulli component.
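Once the predicted and updated densities have been evaluated at each Monte Carlo sample, the reward follows from (5) with the $O(L^{-1})$ term dropped. The sketch below works with log-densities for numerical stability and includes a one-dimensional Gaussian kernel density estimate; the function names, the scalar state, and the fixed bandwidth are illustrative assumptions rather than choices stated in the paper.

```python
import math


def gaussian_kde(x, particles, bandwidth):
    """1-D Gaussian kernel density estimate at x from equally weighted particles."""
    norm = 1.0 / (len(particles) * bandwidth * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in particles)


def renyi_reward(log_pred, log_upd, alpha):
    """Monte Carlo estimate of the reward from per-sample log-densities.

    log_pred[l]: log of the predicted density at sample l.
    log_upd[l]:  log of the updated density at sample l.
    """
    num = len(log_pred)
    # Exponents (1 - alpha) * log(pred / upd), one per sample.
    terms = [(1.0 - alpha) * (lp - lu) for lp, lu in zip(log_pred, log_upd)]
    # Log-sum-exp trick to average the ratios without overflow.
    m = max(terms)
    log_mean = m + math.log(sum(math.exp(t - m) for t in terms)) - math.log(num)
    return log_mean / (alpha - 1.0)


# Toy usage: the predicted density is half the updated one at every sample.
toy_log_upd = [0.0, -1.0, -2.0]
toy_log_pred = [v + math.log(0.5) for v in toy_log_upd]
toy_reward = renyi_reward(toy_log_pred, toy_log_upd, alpha=0.5)
kde_at_zero = gaussian_kde(0.0, [0.0], bandwidth=1.0)
```

When the two densities agree at every sample the estimate is zero, which matches the intuition that no information has been gained.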


5. SIMULATION RESULTS 

A challenging non-linear multi-target tracking scenario,
similar to the one reported in [5], is used to evaluate the
performance of the proposed robust multi-Bernoulli sensor
control method. In this scenario, a mobile sensor manoeuvres
in a dynamic environment of size 1000 m × 1000 m. The sensor
regularly scans the surveillance area and returns a set of
bearing and range measurements corresponding to detected
targets, each in the form
$z_k = [\theta_k, \mathfrak{R}_k]^{\top}$. A total of five
targets appear in the scene and manoeuvre in the surveillance
area. At each time step $k$, the sensor location is denoted by
$s_k = [x_{s_k} \; y_{s_k}]^{\top}$. The sensor enters the
surveillance area at position (10 m, 10 m).
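A noise-free bearing and range measurement of a target seen from the sensor position can be sketched as below. The bearing convention (angle measured counter-clockwise from the x-axis) and the function name are assumptions for illustration; the exact measurement model and noise terms are those of [5] and are not reproduced here.

```python
import math


def bearing_range(target_xy, sensor_xy):
    """Return (bearing in radians, range in metres) of a target from the sensor."""
    dx = target_xy[0] - sensor_xy[0]
    dy = target_xy[1] - sensor_xy[1]
    return math.atan2(dy, dx), math.hypot(dx, dy)


# Toy usage: sensor at its entry position (10 m, 10 m), target 3 m east and 4 m north.
theta, rng_m = bearing_range((13.0, 14.0), (10.0, 10.0))
```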

We ran the proposed robust multi-Bernoulli sensor control
method to estimate the number and locations of the targets
over a sequence of 35 steps. Intuitively, the sensor control
method is expected to select sensor positions that are closer
to existing targets and to end up in their vicinity. The
rationale behind this expectation is that the noise power of
the range measurements increases with distance and the
detection probability decreases at large distances (see
equations (8)-(9) in [5]).

Fig. 1 shows average estimates of cardinality and clutter
intensity calculated over 200 Monte Carlo runs of the proposed
method. The clutter intensity estimates shown in Fig. 1(a)
gradually approach the ground-truth value of 10 and fluctuate
around it with a relatively small standard deviation.

For the averaged cardinality estimates shown in Fig. 1(b), 
ment process parameters would lead to inaccurate results and
frequently missed targets. In contrast, without the need for
such prior information, our method intrinsically adapts the
multi-target filtering process to work best with the
measurements received from the selected sensor command.